Dataset statistics
| Number of variables | 18 |
|---|---|
| Number of observations | 31647 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 20.7 MiB |
| Average record size in memory | 685.2 B |
Variable types
| Numeric | 8 |
|---|---|
| Categorical | 6 |
| Boolean | 4 |
Alerts
| Alert | Type |
|---|---|
| ID is highly overall correlated with housing and 6 other fields | High correlation |
| pdays is highly overall correlated with ID and 1 other field | High correlation |
| previous is highly overall correlated with pdays | High correlation |
| housing is highly overall correlated with ID and 1 other field | High correlation |
| contact is highly overall correlated with ID and 1 other field | High correlation |
| month is highly overall correlated with ID and 3 other fields | High correlation |
| poutcome is highly overall correlated with ID and 1 other field | High correlation |
| age is highly overall correlated with job | High correlation |
| job is highly overall correlated with age and 1 other field | High correlation |
| education is highly overall correlated with job | High correlation |
| day is highly overall correlated with ID and 1 other field | High correlation |
| subscribed is highly overall correlated with ID | High correlation |
| previous is highly skewed (γ1 = 49.30234792) | Skewed |
| ID is uniformly distributed | Uniform |
| ID has unique values | Unique |
| balance has 2470 (7.8%) zeros | Zeros |
| previous has 25924 (81.9%) zeros | Zeros |
Reproduction
| Analysis started | 2023-01-12 13:54:23.989565 |
|---|---|
| Analysis finished | 2023-01-12 13:55:13.470051 |
| Duration | 49.48 seconds |
| Software version | pandas-profiling v3.5.0 |
| Download configuration | config.json |
ID
Real number (ℝ)
| Distinct | 31647 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 22563.972 |
| Minimum | 2 |
|---|---|
| Maximum | 45211 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 247.4 KiB |
Quantile statistics
| Minimum | 2 |
|---|---|
| 5-th percentile | 2251.9 |
| Q1 | 11218 |
| median | 22519 |
| Q3 | 33879.5 |
| 95-th percentile | 42964.1 |
| Maximum | 45211 |
| Range | 45209 |
| Interquartile range (IQR) | 22661.5 |
Descriptive statistics
| Standard deviation | 13075.937 |
|---|---|
| Coefficient of variation (CV) | 0.5795051 |
| Kurtosis | -1.2043412 |
| Mean | 22563.972 |
| Median Absolute Deviation (MAD) | 11330 |
| Skewness | 0.0058507956 |
| Sum | 7.1408203 × 10⁸ |
| Variance | 1.7098013 × 10⁸ |
| Monotonicity | Not monotonic |
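As a quick consistency check, the derived statistics reported above for the ID column can be recomputed from the mean, standard deviation, extremes, and quartiles. The following is a minimal sketch with the values copied by hand from this report; the variable names are illustrative and are not part of pandas-profiling.

```python
# Values copied from the ID tables in this report (rounded as displayed).
std = 13075.937
mean = 22563.972
minimum, maximum = 2, 45211
q1, q3 = 11218, 33879.5

# Coefficient of variation: standard deviation relative to the mean.
cv = std / mean

# Range and interquartile range are simple differences.
value_range = maximum - minimum
iqr = q3 - q1

# Variance is the square of the standard deviation.
variance = std ** 2

print(f"CV       = {cv:.7f}")
print(f"Range    = {value_range}")
print(f"IQR      = {iqr}")
print(f"Variance = {variance:.7e}")
```

Because the inputs are already rounded, the recomputed values should agree with the report only to the displayed precision.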
| Value | Count | Frequency (%) |
| 26110 | 1 | < 0.1% |
| 13339 | 1 | < 0.1% |
| 39681 | 1 | < 0.1% |
| 15135 | 1 | < 0.1% |
| 26037 | 1 | < 0.1% |
| 41484 | 1 | < 0.1% |
| 2281 | 1 | < 0.1% |
| 31869 | 1 | < 0.1% |
| 42096 | 1 | < 0.1% |
| 15737 | 1 | < 0.1% |
| Other values (31637) | 31637 |
| Value | Count | Frequency (%) |
| 2 | 1 | |
| 3 | 1 | |
| 4 | 1 | |
| 5 | 1 | |
| 6 | 1 | |
| 8 | 1 | |
| 9 | 1 | |
| 10 | 1 | |
| 11 | 1 | |
| 13 | 1 |
| Value | Count | Frequency (%) |
| 45211 | 1 | |
| 45210 | 1 | |
| 45209 | 1 | |
| 45208 | 1 | |
| 45207 | 1 | |
| 45205 | 1 | |
| 45204 | 1 | |
| 45203 | 1 | |
| 45200 | 1 | |
| 45199 | 1 |
age
Real number (ℝ)
| Distinct | 76 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 40.957247 |
| Minimum | 18 |
|---|---|
| Maximum | 95 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 247.4 KiB |
Quantile statistics
| Minimum | 18 |
|---|---|
| 5-th percentile | 27 |
| Q1 | 33 |
| median | 39 |
| Q3 | 48 |
| 95-th percentile | 59 |
| Maximum | 95 |
| Range | 77 |
| Interquartile range (IQR) | 15 |
Descriptive statistics
| Standard deviation | 10.625134 |
|---|---|
| Coefficient of variation (CV) | 0.25942013 |
| Kurtosis | 0.29797526 |
| Mean | 40.957247 |
| Median Absolute Deviation (MAD) | 7 |
| Skewness | 0.68160678 |
| Sum | 1296174 |
| Variance | 112.89348 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 32 | 1457 | 4.6% |
| 31 | 1417 | 4.5% |
| 33 | 1406 | 4.4% |
| 34 | 1321 | 4.2% |
| 35 | 1314 | 4.2% |
| 36 | 1245 | 3.9% |
| 30 | 1219 | 3.9% |
| 37 | 1181 | 3.7% |
| 39 | 1079 | 3.4% |
| 38 | 985 | 3.1% |
| Other values (66) | 19023 |
| Value | Count | Frequency (%) |
| 18 | 8 | < 0.1% |
| 19 | 22 | 0.1% |
| 20 | 39 | 0.1% |
| 21 | 48 | 0.2% |
| 22 | 86 | 0.3% |
| 23 | 142 | 0.4% |
| 24 | 212 | 0.7% |
| 25 | 366 | |
| 26 | 564 | |
| 27 | 627 |
| Value | Count | Frequency (%) |
| 95 | 1 | < 0.1% |
| 94 | 1 | < 0.1% |
| 93 | 1 | < 0.1% |
| 92 | 1 | < 0.1% |
| 90 | 1 | < 0.1% |
| 89 | 2 | < 0.1% |
| 88 | 2 | < 0.1% |
| 87 | 2 | < 0.1% |
| 86 | 8 | |
| 84 | 5 |
job
Categorical
| Distinct | 12 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.0 MiB |
Length
| Max length | 13 |
|---|---|
| Median length | 12 |
| Mean length | 9.487408 |
| Min length | 6 |
Characters and Unicode
| Total characters | 300248 |
|---|---|
| Distinct characters | 24 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | admin. |
|---|---|
| 2nd row | unknown |
| 3rd row | services |
| 4th row | management |
| 5th row | technician |
Common Values
| Value | Count | Frequency (%) |
| blue-collar | 6842 | 21.6% |
| management | 6639 | 21.0% |
| technician | 5307 | 16.8% |
| admin. | 3631 | 11.5% |
| services | 2903 | 9.2% |
| retired | 1574 | 5.0% |
| self-employed | 1123 | 3.5% |
| entrepreneur | 1008 | 3.2% |
| unemployed | 905 | 2.9% |
| housemaid | 874 | 2.8% |
| Other values (2) | 841 | 2.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 45101 | |
| n | 31697 | |
| a | 29932 | |
| l | 23677 | 7.9% |
| c | 20359 | 6.8% |
| m | 19811 | 6.6% |
| i | 19596 | 6.5% |
| r | 15917 | 5.3% |
| t | 15798 | 5.3% |
| u | 10470 | 3.5% |
| Other values (14) | 67890 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 288652 | |
| Dash Punctuation | 7965 | 2.7% |
| Other Punctuation | 3631 | 1.2% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 45101 | |
| n | 31697 | |
| a | 29932 | |
| l | 23677 | |
| c | 20359 | 7.1% |
| m | 19811 | 6.9% |
| i | 19596 | 6.8% |
| r | 15917 | 5.5% |
| t | 15798 | 5.5% |
| u | 10470 | 3.6% |
| Other values (12) | 56294 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 7965 |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 3631 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 288652 | |
| Common | 11596 | 3.9% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 45101 | |
| n | 31697 | |
| a | 29932 | |
| l | 23677 | |
| c | 20359 | 7.1% |
| m | 19811 | 6.9% |
| i | 19596 | 6.8% |
| r | 15917 | 5.5% |
| t | 15798 | 5.5% |
| u | 10470 | 3.6% |
| Other values (12) | 56294 |
Common
| Value | Count | Frequency (%) |
| - | 7965 | |
| . | 3631 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 300248 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 45101 | |
| n | 31697 | |
| a | 29932 | |
| l | 23677 | 7.9% |
| c | 20359 | 6.8% |
| m | 19811 | 6.6% |
| i | 19596 | 6.5% |
| r | 15917 | 5.3% |
| t | 15798 | 5.3% |
| u | 10470 | 3.5% |
| Other values (14) | 67890 |
marital
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.9 MiB |
Length
| Max length | 8 |
|---|---|
| Median length | 7 |
| Mean length | 6.8327804 |
| Min length | 6 |
Characters and Unicode
| Total characters | 216237 |
|---|---|
| Distinct characters | 13 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | married |
|---|---|
| 2nd row | married |
| 3rd row | married |
| 4th row | divorced |
| 5th row | married |
Common Values
| Value | Count | Frequency (%) |
| married | 19095 | 60.3% |
| single | 8922 | 28.2% |
| divorced | 3630 | 11.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| r | 41820 | |
| i | 31647 | |
| e | 31647 | |
| d | 26355 | |
| m | 19095 | |
| a | 19095 | |
| s | 8922 | 4.1% |
| n | 8922 | 4.1% |
| g | 8922 | 4.1% |
| l | 8922 | 4.1% |
| Other values (3) | 10890 | 5.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 216237 |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| r | 41820 | |
| i | 31647 | |
| e | 31647 | |
| d | 26355 | |
| m | 19095 | |
| a | 19095 | |
| s | 8922 | 4.1% |
| n | 8922 | 4.1% |
| g | 8922 | 4.1% |
| l | 8922 | 4.1% |
| Other values (3) | 10890 | 5.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 216237 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| r | 41820 | |
| i | 31647 | |
| e | 31647 | |
| d | 26355 | |
| m | 19095 | |
| a | 19095 | |
| s | 8922 | 4.1% |
| n | 8922 | 4.1% |
| g | 8922 | 4.1% |
| l | 8922 | 4.1% |
| Other values (3) | 10890 | 5.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 216237 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| r | 41820 | |
| i | 31647 | |
| e | 31647 | |
| d | 26355 | |
| m | 19095 | |
| a | 19095 | |
| s | 8922 | 4.1% |
| n | 8922 | 4.1% |
| g | 8922 | 4.1% |
| l | 8922 | 4.1% |
| Other values (3) | 10890 | 5.0% |
education
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.0 MiB |
Length
| Max length | 9 |
|---|---|
| Median length | 9 |
| Mean length | 8.3192088 |
| Min length | 7 |
Characters and Unicode
| Total characters | 263278 |
|---|---|
| Distinct characters | 16 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | unknown |
|---|---|
| 2nd row | secondary |
| 3rd row | secondary |
| 4th row | tertiary |
| 5th row | secondary |
Common Values
| Value | Count | Frequency (%) |
| secondary | 16224 | 51.3% |
| tertiary | 9301 | 29.4% |
| primary | 4808 | 15.2% |
| unknown | 1314 | 4.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| r | 44442 | |
| a | 30333 | |
| y | 30333 | |
| e | 25525 | |
| n | 20166 | |
| t | 18602 | |
| o | 17538 | 6.7% |
| s | 16224 | 6.2% |
| c | 16224 | 6.2% |
| d | 16224 | 6.2% |
| Other values (6) | 27667 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 263278 |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| r | 44442 | |
| a | 30333 | |
| y | 30333 | |
| e | 25525 | |
| n | 20166 | |
| t | 18602 | |
| o | 17538 | 6.7% |
| s | 16224 | 6.2% |
| c | 16224 | 6.2% |
| d | 16224 | 6.2% |
| Other values (6) | 27667 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 263278 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| r | 44442 | |
| a | 30333 | |
| y | 30333 | |
| e | 25525 | |
| n | 20166 | |
| t | 18602 | |
| o | 17538 | 6.7% |
| s | 16224 | 6.2% |
| c | 16224 | 6.2% |
| d | 16224 | 6.2% |
| Other values (6) | 27667 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 263278 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| r | 44442 | |
| a | 30333 | |
| y | 30333 | |
| e | 25525 | |
| n | 20166 | |
| t | 18602 | |
| o | 17538 | 6.7% |
| s | 16224 | 6.2% |
| c | 16224 | 6.2% |
| d | 16224 | 6.2% |
| Other values (6) | 27667 |
default
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 31.0 KiB |
| Value | Count | Frequency (%) |
| False | 31062 | 98.2% |
| True | 585 | 1.8% |
balance
Real number (ℝ)
| Distinct | 6326 |
|---|---|
| Distinct (%) | 20.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1363.8903 |
| Minimum | -8019 |
|---|---|
| Maximum | 102127 |
| Zeros | 2470 |
| Zeros (%) | 7.8% |
| Negative | 2665 |
| Negative (%) | 8.4% |
| Memory size | 247.4 KiB |
Quantile statistics
| Minimum | -8019 |
|---|---|
| 5-th percentile | -173 |
| Q1 | 73 |
| median | 450 |
| Q3 | 1431 |
| 95-th percentile | 5768 |
| Maximum | 102127 |
| Range | 110146 |
| Interquartile range (IQR) | 1358 |
Descriptive statistics
| Standard deviation | 3028.3043 |
|---|---|
| Coefficient of variation (CV) | 2.2203431 |
| Kurtosis | 126.45128 |
| Mean | 1363.8903 |
| Median Absolute Deviation (MAD) | 450 |
| Skewness | 7.9956956 |
| Sum | 43163035 |
| Variance | 9170626.9 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 2470 | 7.8% |
| 1 | 137 | 0.4% |
| 2 | 109 | 0.3% |
| 4 | 95 | 0.3% |
| 3 | 88 | 0.3% |
| 5 | 78 | 0.2% |
| 6 | 62 | 0.2% |
| 8 | 52 | 0.2% |
| 10 | 48 | 0.2% |
| 23 | 48 | 0.2% |
| Other values (6316) | 28460 |
| Value | Count | Frequency (%) |
| -8019 | 1 | |
| -6847 | 1 | |
| -4057 | 1 | |
| -3372 | 1 | |
| -3058 | 1 | |
| -2712 | 1 | |
| -2604 | 1 | |
| -2282 | 1 | |
| -2122 | 1 | |
| -2082 | 1 |
| Value | Count | Frequency (%) |
| 102127 | 1 | |
| 81204 | 1 | |
| 66721 | 1 | |
| 66653 | 1 | |
| 58932 | 1 | |
| 58544 | 1 | |
| 57435 | 1 | |
| 56831 | 1 | |
| 52587 | 2 | |
| 52527 | 1 |
housing
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 31.0 KiB |
| Value | Count | Frequency (%) |
| True | 17584 | 55.6% |
| False | 14063 | 44.4% |
loan
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 31.0 KiB |
| Value | Count | Frequency (%) |
| False | 26516 | 83.8% |
| True | 5131 | 16.2% |
contact
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.0 MiB |
Length
| Max length | 9 |
|---|---|
| Median length | 8 |
| Mean length | 7.7747022 |
| Min length | 7 |
Characters and Unicode
| Total characters | 246046 |
|---|---|
| Distinct characters | 13 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | telephone |
|---|---|
| 2nd row | cellular |
| 3rd row | cellular |
| 4th row | cellular |
| 5th row | cellular |
Common Values
| Value | Count | Frequency (%) |
| cellular | 20423 | 64.5% |
| unknown | 9177 | 29.0% |
| telephone | 2047 | 6.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| l | 63316 | |
| u | 29600 | |
| n | 29578 | |
| e | 26564 | |
| c | 20423 | 8.3% |
| a | 20423 | 8.3% |
| r | 20423 | 8.3% |
| o | 11224 | 4.6% |
| k | 9177 | 3.7% |
| w | 9177 | 3.7% |
| Other values (3) | 6141 | 2.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 246046 |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| l | 63316 | |
| u | 29600 | |
| n | 29578 | |
| e | 26564 | |
| c | 20423 | 8.3% |
| a | 20423 | 8.3% |
| r | 20423 | 8.3% |
| o | 11224 | 4.6% |
| k | 9177 | 3.7% |
| w | 9177 | 3.7% |
| Other values (3) | 6141 | 2.5% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 246046 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| l | 63316 | |
| u | 29600 | |
| n | 29578 | |
| e | 26564 | |
| c | 20423 | 8.3% |
| a | 20423 | 8.3% |
| r | 20423 | 8.3% |
| o | 11224 | 4.6% |
| k | 9177 | 3.7% |
| w | 9177 | 3.7% |
| Other values (3) | 6141 | 2.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 246046 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| l | 63316 | |
| u | 29600 | |
| n | 29578 | |
| e | 26564 | |
| c | 20423 | 8.3% |
| a | 20423 | 8.3% |
| r | 20423 | 8.3% |
| o | 11224 | 4.6% |
| k | 9177 | 3.7% |
| w | 9177 | 3.7% |
| Other values (3) | 6141 | 2.5% |
day
Real number (ℝ)
| Distinct | 31 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 15.835466 |
| Minimum | 1 |
|---|---|
| Maximum | 31 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 247.4 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 3 |
| Q1 | 8 |
| median | 16 |
| Q3 | 21 |
| 95-th percentile | 29 |
| Maximum | 31 |
| Range | 30 |
| Interquartile range (IQR) | 13 |
Descriptive statistics
| Standard deviation | 8.3370967 |
|---|---|
| Coefficient of variation (CV) | 0.52648255 |
| Kurtosis | -1.067397 |
| Mean | 15.835466 |
| Median Absolute Deviation (MAD) | 7 |
| Skewness | 0.087185435 |
| Sum | 501145 |
| Variance | 69.507182 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 20 | 1909 | 6.0% |
| 18 | 1612 | 5.1% |
| 21 | 1445 | 4.6% |
| 5 | 1373 | 4.3% |
| 6 | 1348 | 4.3% |
| 17 | 1344 | 4.2% |
| 14 | 1283 | 4.1% |
| 8 | 1281 | 4.0% |
| 28 | 1276 | 4.0% |
| 29 | 1241 | 3.9% |
| Other values (21) | 17535 |
| Value | Count | Frequency (%) |
| 1 | 220 | 0.7% |
| 2 | 900 | |
| 3 | 761 | |
| 4 | 1016 | |
| 5 | 1373 | |
| 6 | 1348 | |
| 7 | 1240 | |
| 8 | 1281 | |
| 9 | 1097 | |
| 10 | 360 | 1.1% |
| Value | Count | Frequency (%) |
| 31 | 460 | 1.5% |
| 30 | 1082 | |
| 29 | 1241 | |
| 28 | 1276 | |
| 27 | 804 | |
| 26 | 761 | |
| 25 | 586 | |
| 24 | 305 | 1.0% |
| 23 | 657 | |
| 22 | 640 |
month
Categorical
| Distinct | 12 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.8 MiB |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 94941 |
|---|---|
| Distinct characters | 19 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | nov |
|---|---|
| 2nd row | jul |
| 3rd row | jul |
| 4th row | jun |
| 5th row | feb |
Common Values
| Value | Count | Frequency (%) |
| may | 9669 | 30.6% |
| jul | 4844 | 15.3% |
| aug | 4333 | 13.7% |
| jun | 3738 | 11.8% |
| nov | 2783 | 8.8% |
| apr | 2055 | 6.5% |
| feb | 1827 | 5.8% |
| jan | 977 | 3.1% |
| oct | 512 | 1.6% |
| sep | 410 | 1.3% |
| Other values (2) | 499 | 1.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 17376 | |
| u | 12915 | |
| m | 10011 | |
| y | 9669 | |
| j | 9559 | |
| n | 7498 | |
| l | 4844 | 5.1% |
| g | 4333 | 4.6% |
| o | 3295 | 3.5% |
| v | 2783 | 2.9% |
| Other values (9) | 12658 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 94941 |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 17376 | |
| u | 12915 | |
| m | 10011 | |
| y | 9669 | |
| j | 9559 | |
| n | 7498 | |
| l | 4844 | 5.1% |
| g | 4333 | 4.6% |
| o | 3295 | 3.5% |
| v | 2783 | 2.9% |
| Other values (9) | 12658 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 94941 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 17376 | |
| u | 12915 | |
| m | 10011 | |
| y | 9669 | |
| j | 9559 | |
| n | 7498 | |
| l | 4844 | 5.1% |
| g | 4333 | 4.6% |
| o | 3295 | 3.5% |
| v | 2783 | 2.9% |
| Other values (9) | 12658 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 94941 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| a | 17376 | |
| u | 12915 | |
| m | 10011 | |
| y | 9669 | |
| j | 9559 | |
| n | 7498 | |
| l | 4844 | 5.1% |
| g | 4333 | 4.6% |
| o | 3295 | 3.5% |
| v | 2783 | 2.9% |
| Other values (9) | 12658 |
duration
Real number (ℝ)
| Distinct | 1454 |
|---|---|
| Distinct (%) | 4.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 258.11353 |
| Minimum | 0 |
|---|---|
| Maximum | 4918 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 247.4 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 36 |
| Q1 | 104 |
| median | 180 |
| Q3 | 318.5 |
| 95-th percentile | 752 |
| Maximum | 4918 |
| Range | 4918 |
| Interquartile range (IQR) | 214.5 |
Descriptive statistics
| Standard deviation | 257.11897 |
|---|---|
| Coefficient of variation (CV) | 0.99614681 |
| Kurtosis | 19.487627 |
| Mean | 258.11353 |
| Median Absolute Deviation (MAD) | 93 |
| Skewness | 3.1997657 |
| Sum | 8168519 |
| Variance | 66110.166 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 90 | 135 | 0.4% |
| 124 | 130 | 0.4% |
| 139 | 127 | 0.4% |
| 88 | 127 | 0.4% |
| 104 | 127 | 0.4% |
| 112 | 125 | 0.4% |
| 76 | 125 | 0.4% |
| 135 | 124 | 0.4% |
| 136 | 123 | 0.4% |
| 166 | 123 | 0.4% |
| Other values (1444) | 30381 |
| Value | Count | Frequency (%) |
| 0 | 1 | < 0.1% |
| 2 | 3 | < 0.1% |
| 3 | 3 | < 0.1% |
| 4 | 11 | < 0.1% |
| 5 | 20 | 0.1% |
| 6 | 32 | |
| 7 | 43 | |
| 8 | 60 | |
| 9 | 61 | |
| 10 | 49 |
| Value | Count | Frequency (%) |
| 4918 | 1 | |
| 3881 | 1 | |
| 3785 | 1 | |
| 3422 | 1 | |
| 3366 | 1 | |
| 3322 | 1 | |
| 3284 | 1 | |
| 3183 | 1 | |
| 3102 | 1 | |
| 3076 | 1 |
campaign
Real number (ℝ)
| Distinct | 45 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.7656966 |
| Minimum | 1 |
|---|---|
| Maximum | 63 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 247.4 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 2 |
| Q3 | 3 |
| 95-th percentile | 8 |
| Maximum | 63 |
| Range | 62 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 3.11383 |
|---|---|
| Coefficient of variation (CV) | 1.1258755 |
| Kurtosis | 38.057995 |
| Mean | 2.7656966 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 4.8739349 |
| Sum | 87526 |
| Variance | 9.6959376 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 12262 | 38.7% |
| 2 | 8798 | 27.8% |
| 3 | 3858 | 12.2% |
| 4 | 2442 | 7.7% |
| 5 | 1245 | 3.9% |
| 6 | 916 | 2.9% |
| 7 | 518 | 1.6% |
| 8 | 356 | 1.1% |
| 9 | 236 | 0.7% |
| 10 | 184 | 0.6% |
| Other values (35) | 832 | 2.6% |
| Value | Count | Frequency (%) |
| 1 | 12262 | 38.7% |
| 2 | 8798 | 27.8% |
| 3 | 3858 | 12.2% |
| 4 | 2442 | 7.7% |
| 5 | 1245 | 3.9% |
| 6 | 916 | 2.9% |
| 7 | 518 | 1.6% |
| 8 | 356 | 1.1% |
| 9 | 236 | 0.7% |
| 10 | 184 | 0.6% |
| Value | Count | Frequency (%) |
| 63 | 1 | < 0.1% |
| 55 | 1 | < 0.1% |
| 50 | 1 | < 0.1% |
| 44 | 1 | < 0.1% |
| 43 | 3 | |
| 41 | 1 | < 0.1% |
| 39 | 1 | < 0.1% |
| 38 | 3 | |
| 37 | 2 | |
| 36 | 1 | < 0.1% |
pdays
Real number (ℝ)
| Distinct | 509 |
|---|---|
| Distinct (%) | 1.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 39.576042 |
| Minimum | -1 |
|---|---|
| Maximum | 871 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 25924 |
| Negative (%) | 81.9% |
| Memory size | 247.4 KiB |
Quantile statistics
| Minimum | -1 |
|---|---|
| 5-th percentile | -1 |
| Q1 | -1 |
| median | -1 |
| Q3 | -1 |
| 95-th percentile | 313 |
| Maximum | 871 |
| Range | 872 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 99.317592 |
|---|---|
| Coefficient of variation (CV) | 2.5095383 |
| Kurtosis | 7.1112947 |
| Mean | 39.576042 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 2.6423742 |
| Sum | 1252463 |
| Variance | 9863.9842 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| -1 | 25924 | 81.9% |
| 182 | 118 | 0.4% |
| 92 | 100 | 0.3% |
| 91 | 87 | 0.3% |
| 183 | 85 | 0.3% |
| 181 | 75 | 0.2% |
| 370 | 65 | 0.2% |
| 184 | 62 | 0.2% |
| 95 | 54 | 0.2% |
| 350 | 51 | 0.2% |
| Other values (499) | 5026 | 15.9% |
| Value | Count | Frequency (%) |
| -1 | 25924 | 81.9% |
| 1 | 11 | < 0.1% |
| 2 | 25 | 0.1% |
| 4 | 2 | < 0.1% |
| 5 | 7 | < 0.1% |
| 6 | 7 | < 0.1% |
| 7 | 6 | < 0.1% |
| 8 | 16 | 0.1% |
| 9 | 8 | < 0.1% |
| 10 | 6 | < 0.1% |
| Value | Count | Frequency (%) |
| 871 | 1 | |
| 854 | 1 | |
| 842 | 1 | |
| 838 | 1 | |
| 805 | 1 | |
| 804 | 1 | |
| 792 | 2 | |
| 791 | 1 | |
| 784 | 1 | |
| 782 | 1 |
previous
Real number (ℝ)
| Distinct | 38 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.57427244 |
| Minimum | 0 |
|---|---|
| Maximum | 275 |
| Zeros | 25924 |
| Zeros (%) | 81.9% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 247.4 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 3 |
| Maximum | 275 |
| Range | 275 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 2.4225289 |
|---|---|
| Coefficient of variation (CV) | 4.2184313 |
| Kurtosis | 5236.4116 |
| Mean | 0.57427244 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 49.302348 |
| Sum | 18174 |
| Variance | 5.868646 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 25924 | 81.9% |
| 1 | 1921 | 6.1% |
| 2 | 1481 | 4.7% |
| 3 | 780 | 2.5% |
| 4 | 501 | 1.6% |
| 5 | 311 | 1.0% |
| 6 | 188 | 0.6% |
| 7 | 138 | 0.4% |
| 8 | 81 | 0.3% |
| 9 | 64 | 0.2% |
| Other values (28) | 258 | 0.8% |
| Value | Count | Frequency (%) |
| 0 | 25924 | 81.9% |
| 1 | 1921 | 6.1% |
| 2 | 1481 | 4.7% |
| 3 | 780 | 2.5% |
| 4 | 501 | 1.6% |
| 5 | 311 | 1.0% |
| 6 | 188 | 0.6% |
| 7 | 138 | 0.4% |
| 8 | 81 | 0.3% |
| 9 | 64 | 0.2% |
| Value | Count | Frequency (%) |
| 275 | 1 | |
| 58 | 1 | |
| 41 | 1 | |
| 38 | 1 | |
| 37 | 1 | |
| 35 | 1 | |
| 32 | 1 | |
| 30 | 1 | |
| 29 | 2 | |
| 28 | 1 |
poutcome
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.9 MiB |
Length
| Max length | 7 |
|---|---|
| Median length | 7 |
| Mean length | 6.9186021 |
| Min length | 5 |
Characters and Unicode
| Total characters | 218953 |
|---|---|
| Distinct characters | 15 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | unknown |
|---|---|
| 2nd row | unknown |
| 3rd row | unknown |
| 4th row | success |
| 5th row | unknown |
Common Values
| Value | Count | Frequency (%) |
| unknown | 25929 | 81.9% |
| failure | 3362 | 10.6% |
| other | 1288 | 4.1% |
| success | 1068 | 3.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| n | 77787 | |
| u | 30359 | 13.9% |
| o | 27217 | 12.4% |
| k | 25929 | 11.8% |
| w | 25929 | 11.8% |
| e | 5718 | 2.6% |
| r | 4650 | 2.1% |
| f | 3362 | 1.5% |
| a | 3362 | 1.5% |
| i | 3362 | 1.5% |
| Other values (5) | 11278 | 5.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 218953 |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| n | 77787 | |
| u | 30359 | 13.9% |
| o | 27217 | 12.4% |
| k | 25929 | 11.8% |
| w | 25929 | 11.8% |
| e | 5718 | 2.6% |
| r | 4650 | 2.1% |
| f | 3362 | 1.5% |
| a | 3362 | 1.5% |
| i | 3362 | 1.5% |
| Other values (5) | 11278 | 5.2% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 218953 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| n | 77787 | |
| u | 30359 | 13.9% |
| o | 27217 | 12.4% |
| k | 25929 | 11.8% |
| w | 25929 | 11.8% |
| e | 5718 | 2.6% |
| r | 4650 | 2.1% |
| f | 3362 | 1.5% |
| a | 3362 | 1.5% |
| i | 3362 | 1.5% |
| Other values (5) | 11278 | 5.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 218953 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| n | 77787 | |
| u | 30359 | 13.9% |
| o | 27217 | 12.4% |
| k | 25929 | 11.8% |
| w | 25929 | 11.8% |
| e | 5718 | 2.6% |
| r | 4650 | 2.1% |
| f | 3362 | 1.5% |
| a | 3362 | 1.5% |
| i | 3362 | 1.5% |
| Other values (5) | 11278 | 5.2% |
subscribed
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 31.0 KiB |
| Value | Count | Frequency (%) |
| False | 27932 | 88.3% |
| True | 3715 | 11.7% |
Auto
The auto setting selects an interpretable pairwise metric according to the types of the two columns, using the following mapping (Variable_type-Variable_type : Method, Range):
- Categorical-Categorical : Cramér's V, [0,1]
- Numerical-Categorical : Cramér's V, [0,1] (using a discretized numerical column)
- Numerical-Numerical : Spearman's ρ, [-1,1]
This configuration uses the recommended metric for each pair of columns.
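The mapping above amounts to a small dispatch on the pair of variable types. The sketch below illustrates only that selection logic; the function name `auto_metric` and the type labels `"numerical"` and `"categorical"` are hypothetical and are not taken from the pandas-profiling API.

```python
def auto_metric(type_a, type_b):
    """Return (method, range) for a pair of variable types, per the mapping above.

    The type labels "numerical" and "categorical" are illustrative assumptions.
    """
    kinds = {type_a, type_b}
    if kinds == {"numerical"}:
        # Both columns numerical: rank-based monotonic correlation.
        return ("Spearman's rho", (-1.0, 1.0))
    if kinds <= {"numerical", "categorical"}:
        # At least one categorical column: Cramér's V. In the mixed case the
        # numerical column would first need to be discretized into bins.
        return ("Cramér's V", (0.0, 1.0))
    raise ValueError(f"unsupported variable types: {type_a!r}, {type_b!r}")


print(auto_metric("numerical", "numerical"))
print(auto_metric("numerical", "categorical"))
print(auto_metric("categorical", "categorical"))
```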
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
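The calculation described above (Pearson's formula applied to ranks) can be sketched in plain Python. This is a minimal illustration, not the code the report itself runs; the helper names are made up for the example, and ties are given average ranks, which is a common convention assumed here.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # monotonic but nonlinear in x

print("Spearman:", spearman_rho(x, y))
print("Pearson: ", pearson(x, y))
```

For this monotonic but nonlinear relationship, ρ is (up to floating-point error) 1 while Pearson's r stays below 1, which is the distinction the paragraph draws.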
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the slope of the line (its angle to the x-axis) does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
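The invariance property mentioned above can be checked numerically. The following sketch uses a hand-written `pearson_r` helper (an illustrative name, not a library function) and applies an arbitrary positive rescaling and shift to each variable.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

r = pearson_r(x, y)

# Shift and rescale each variable separately (positive scale factors).
x_shifted = [10 * a + 3 for a in x]
y_shifted = [0.5 * b - 7 for b in y]
r_shifted = pearson_r(x_shifted, y_shifted)

print("r          =", r)
print("r (shifted)=", r_shifted)
```

Note that a negative scale factor on one variable would flip the sign of r; the invariance holds for positive rescaling.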
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
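The pair-counting definition above translates directly into code. This sketch implements exactly that formula (often called τ-a); it makes no tie correction, so library implementations that apply one may return different values when the data contain ties. The function name is illustrative.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1  # both variables order the pair the same way
        elif sign < 0:
            discordant += 1  # the variables order the pair in opposite ways
        # sign == 0 means a tie in x or y; it counts toward neither total
    n = len(x)
    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs


x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

tau = kendall_tau_a(x, y)
print("tau =", tau)
```

In this example 8 of the 10 pairs are concordant and 2 are discordant, giving τ = (8 - 2) / 10.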
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available for it.
First rows
| | ID | age | job | marital | education | default | balance | housing | loan | contact | day | month | duration | campaign | pdays | previous | poutcome | subscribed |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 26110 | 56 | admin. | married | unknown | no | 1933 | no | no | telephone | 19 | nov | 44 | 2 | -1 | 0 | unknown | no |
| 1 | 40576 | 31 | unknown | married | secondary | no | 3 | no | no | cellular | 20 | jul | 91 | 2 | -1 | 0 | unknown | no |
| 2 | 15320 | 27 | services | married | secondary | no | 891 | yes | no | cellular | 18 | jul | 240 | 1 | -1 | 0 | unknown | no |
| 3 | 43962 | 57 | management | divorced | tertiary | no | 3287 | no | no | cellular | 22 | jun | 867 | 1 | 84 | 3 | success | yes |
| 4 | 29842 | 31 | technician | married | secondary | no | 119 | yes | no | cellular | 4 | feb | 380 | 1 | -1 | 0 | unknown | no |
| 5 | 29390 | 33 | management | single | tertiary | no | 0 | yes | no | cellular | 2 | feb | 116 | 3 | -1 | 0 | unknown | no |
| 6 | 40444 | 56 | retired | married | secondary | no | 1044 | no | no | telephone | 3 | jul | 353 | 2 | -1 | 0 | unknown | yes |
| 7 | 40194 | 50 | technician | single | secondary | no | 1811 | no | no | cellular | 8 | jun | 97 | 4 | -1 | 0 | unknown | no |
| 8 | 29824 | 45 | blue-collar | divorced | secondary | no | 1951 | yes | no | cellular | 4 | feb | 692 | 1 | -1 | 0 | unknown | no |
| 9 | 44676 | 35 | admin. | married | secondary | no | 1204 | no | no | cellular | 3 | sep | 789 | 2 | -1 | 0 | unknown | no |
Last rows
| | ID | age | job | marital | education | default | balance | housing | loan | contact | day | month | duration | campaign | pdays | previous | poutcome | subscribed |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 31637 | 20110 | 44 | technician | married | secondary | no | 5163 | no | no | cellular | 11 | aug | 48 | 2 | -1 | 0 | unknown | no |
| 31638 | 16309 | 29 | blue-collar | married | secondary | no | 721 | yes | no | cellular | 23 | jul | 644 | 1 | -1 | 0 | unknown | no |
| 31639 | 279 | 38 | services | single | secondary | no | 570 | yes | no | unknown | 5 | may | 75 | 2 | -1 | 0 | unknown | no |
| 31640 | 12109 | 43 | management | single | secondary | no | 2968 | no | no | unknown | 20 | jun | 30 | 4 | -1 | 0 | unknown | no |
| 31641 | 9476 | 37 | technician | single | tertiary | no | 1309 | no | no | unknown | 6 | jun | 442 | 2 | -1 | 0 | unknown | no |
| 31642 | 36483 | 29 | management | single | tertiary | no | 0 | yes | no | cellular | 12 | may | 116 | 2 | -1 | 0 | unknown | no |
| 31643 | 40178 | 53 | management | divorced | tertiary | no | 380 | no | yes | cellular | 5 | jun | 438 | 2 | -1 | 0 | unknown | yes |
| 31644 | 19710 | 32 | management | single | tertiary | no | 312 | no | no | cellular | 7 | aug | 37 | 3 | -1 | 0 | unknown | no |
| 31645 | 38556 | 57 | technician | married | secondary | no | 225 | yes | no | telephone | 15 | may | 22 | 7 | 337 | 12 | failure | no |
| 31646 | 14156 | 55 | management | divorced | secondary | no | 204 | yes | no | cellular | 11 | jul | 1973 | 2 | -1 | 0 | unknown | yes |